Abstract

We propose a network description of large market investments, where both stocks
and shareholders are represented as vertices connected by weighted links
corresponding to shareholdings. In this framework, the in-degree ($k_{in}$) and the sum of
incoming link weights ($v$) of an investor correspond to the number of assets held
(portfolio diversification) and to the invested wealth (portfolio volume) respectively.
An empirical analysis of three different real markets reveals that the distributions
of both $k_{in}$ and $v$ display power-law tails with exponents $\gamma$ and $\alpha$. Moreover, we
find that $k_{in}$ scales as a power-law function of $v$ with an exponent $\beta$. Remarkably,
although the values of $\alpha$, $\beta$ and $\gamma$ differ across the three markets, they are always
governed by the scaling relation $\beta = (1 - \alpha)/(1 - \gamma)$. We show that these empirical
findings can be reproduced by a recent model relating the emergence of scale-free
networks to an underlying Paretian distribution of 'hidden' vertex properties.
1 Introduction



A fundamental problem in economics is to characterize different systems by
means of simple and universal features. The power-law form of the statistical
distributions of many quantities, including individual wealth[1,2,3,4], firm
size[5] and financial market fluctuations[6,7,8], seems to be one of such 'stylized
facts'. As in many other complex systems, the emergence of this behaviour
can be related to the interactions of a large number of agents[9,10,11]. On the
other hand, recent advances in network theory[12] make it possible to describe
economic systems internally and to characterize them through novel quantities.
Indeed, the topology of various economic networks, ranging from those formed
by directors of corporate boards[12,13] to those generated by the strongest
asset correlations[14], is again characterized by power-law distributions, in close
analogy with many other networks (including the Internet, the WWW and biological
webs[12]).

In the present paper we propose a network description of the financial sys- 
tem formed by the assets traded in a stock market and the corresponding 
shareholders. As we show below, we find that shareholding networks are again 
characterized by power-law distributions, which here describe the volume and 
diversification of portfolios. These quantities are the subject of fundamental 
financial issues such as portfolio optimization [15], and our empirical analysis 
reveals that they are related through nontrivial scaling relations. We finally 
show that the above results can be reproduced by a simple network model [16] 
which assumes that the topological properties depend on some non-topological 
quantity (or fitness) which in this case represents the wealth invested by the
shareholders. 



2 Introducing the shareholding networks 

The data sets we analysed refer to the shareholders of all assets traded in 
the Italian stock market [17] (MIB) in the year 2002, in the New York Stock 
Exchange[18](NYSE) in the year 2000 and in the National Association of 
Security Dealers Automated Quotations[18] (NASDAQ) in the year 2000. The 
number M of assets in the markets is 240, 2053 and 3063 respectively. The data 
necessarily report, for each asset, only a limited number of investors (generally 
holding a significant fraction of shares of it). While this biases the estimate of 
the number of investors of each asset (which can in principle be very large), it
does not affect qualitatively the statistical properties of the number of assets 
in the portfolio of each reported investor. 

Fig. 1. Shareholding networks for the Italian market: a) the extended net (red
vertices = stocks, green vertices = shareholders) and b) the restricted one (stocks
labelled by the name of the corresponding company). Arrow size is proportional to
the fraction of shares owned.

Fig. 2. Cumulative histograms of $k_{in}$ for the extended nets (main panel, with
power-law fit) and the restricted ones (inset).

As is well known, some shareholders of a company are frequently themselves
companies whose shares are traded in the market, so that
a significant fraction of listed companies are also owners of other
listed companies. This leads naturally to a network description of the whole
system (see Fig. 1a), where the $N$ investors and the $M$ assets are both
represented as vertices and a directed link is drawn from an asset to each of its
shareholders (which can be persons or listed companies themselves, so that
the total number of vertices is less than $N + M$). In this topological description
the in-degree $(k_{in})_i$ (number of incoming links) of the investor $i$ corresponds
to the number of different assets in its portfolio (which we call the 'portfolio
diversification'). Vertices with zero in-degree are listed companies holding no
shares of other stocks. The out-degree $k_{out}$ of a vertex is the number of
shareholders of the corresponding asset, but as discussed above this is a biased
quantity and we cannot deal with its statistical description. We also note that
a weight can be assigned to each link, defined as the fraction $s_{ij}$ of the outstanding
shares of asset $j$ held by $i$, multiplied by the market capitalization $c_j$
of asset $j$. The quantity $v_i = \sum_j s_{ij} c_j$ (hereafter the 'portfolio volume')
is therefore the total wealth in the portfolio of $i$. If we consider the subnet
restricted to the owners which are listed companies themselves (hereafter the
'restricted' net), we obtain the structure reported in Fig. 1b, which provides a
description of the interconnections among stocks. The whole networks will be
denoted as the 'extended' ones.
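The definitions of portfolio diversification and portfolio volume can be illustrated with a minimal sketch. The record layout (investor, asset, fraction), the function name `portfolio_stats` and the toy numbers are assumptions made for illustration and are not taken from the data sets; each (investor, asset) pair is assumed to appear at most once.

```python
from collections import defaultdict

def portfolio_stats(holdings, market_cap):
    """Compute in-degree and portfolio volume for every investor.

    holdings:   iterable of (investor, asset, fraction_of_shares) records
    market_cap: dict mapping each asset to its market capitalization
    """
    k_in = defaultdict(int)      # number of distinct assets held
    volume = defaultdict(float)  # sum over assets of fraction * capitalization
    for investor, asset, fraction in holdings:
        k_in[investor] += 1
        volume[investor] += fraction * market_cap[asset]
    return dict(k_in), dict(volume)

# Toy data (hypothetical): stock_X is both an asset and a shareholder.
holdings = [
    ("fund_A", "stock_X", 0.10),
    ("fund_A", "stock_Y", 0.05),
    ("stock_X", "stock_Y", 0.20),
]
market_cap = {"stock_X": 1000.0, "stock_Y": 400.0}

k_in, volume = portfolio_stats(holdings, market_cap)

# The restricted net keeps only links whose owner is itself a listed asset.
restricted = [h for h in holdings if h[0] in market_cap]
```

In this toy example `fund_A` holds two assets and `stock_X` holds one, and only the link owned by `stock_X` survives in the restricted net.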

In order to characterize the topology of these systems we consider the statistical
distribution $P_>(k_{in})$ of the number of vertices with in-degree greater than
or equal to $k_{in}$. This analysis has been performed on both the extended and
the restricted nets. As reported in Fig. 2, the tail of the distribution computed
on the extended nets can always be fitted by a power law of the form

$$P_>(k_{in}) \propto k_{in}^{1-\gamma} \qquad (1)$$

This corresponds (for large values of $k_{in}$) to a probability density $P(k_{in}) \propto k_{in}^{-\gamma}$
of finding a holder with a portfolio diversified in exactly $k_{in}$ different stocks.
The values of the exponent $\gamma$ differ across markets: $\gamma_{NYS} = 2.37$, $\gamma_{NAS} = 2.22$,
$\gamma_{MIB} = 2.97$ (note, however, that in the Italian case the rather large exponent
$\gamma_{MIB}$ and the small size of the net result in a small value $k^{max} = 19$ of the
largest degree). In the inset of Fig. 2 we report the behaviour of $P_>(k_{in})$
computed on the restricted nets. In this case the situation is very different, and no
scale-free behaviour is observable. In particular, in US markets the maximum
in-degree is significantly decreased, while in the Italian one it remains the
same. This means that in the extended networks describing NYSE and NASDAQ
the tail of $P_>(k_{in})$ is dominated by large investors outside the market,
while in MIB it is dominated by listed companies, which are the largest holders
of the market. For the small $k_{in}$ region of $P_>(k_{in})$ the opposite occurs. This is
reflected in the fact that only 7% of companies quoted in US markets invest in
other companies, while the corresponding fraction is 57% in the Italian case.
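A cumulative histogram of this kind, and a rough estimate of its tail slope, can be sketched as follows. The function names, the `k_min` cutoff and the least-squares fit in log-log axes are illustrative assumptions; the fitting procedure actually used for Fig. 2 is not specified in the text.

```python
import math
from collections import Counter

def cumulative_counts(degrees):
    """Return sorted (k, number of vertices with degree >= k) pairs."""
    counts = Counter(degrees)
    pairs = []
    running = 0
    for k in sorted(counts, reverse=True):
        running += counts[k]
        pairs.append((k, running))
    pairs.reverse()
    return pairs

def loglog_slope(pairs, k_min=1):
    """Least-squares slope of log(count) against log(k) for k >= k_min.

    Needs at least two distinct k values at or above k_min.
    """
    pts = [(math.log(k), math.log(n)) for k, n in pairs if k >= k_min]
    mean_x = sum(x for x, _ in pts) / len(pts)
    mean_y = sum(y for _, y in pts) / len(pts)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    s_xx = sum((x - mean_x) ** 2 for x, _ in pts)
    return s_xy / s_xx

# Under a tail of the form of eq. (1) the slope equals 1 - gamma,
# so a density exponent estimate is gamma = 1 - slope.
example_pairs = cumulative_counts([1, 1, 1, 2, 2, 3])
exact_tail = [(1, 100), (10, 10), (100, 1)]  # synthetic points with slope -1
gamma_estimate = 1.0 - loglog_slope(exact_tail)
```

Least squares on a cumulative histogram is only one of several possible estimators, and the result depends on the chosen cutoff `k_min`.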



3 Pareto's law generalized to portfolio volume

To capture the weighted nature of the networks, we also consider the number
$p_>(v)$ of investors with portfolio volume greater than or equal to $v$. Once more
(see Fig. 3a), we find that in all cases the tail of the distribution is well fitted
by a power law

$$p_>(v) \propto v^{1-\alpha} \qquad (2)$$

corresponding to a probability density $p(v) \propto v^{-\alpha}$. The empirical values of
the exponent are $\alpha_{NYS} = 1.95$, $\alpha_{NAS} = 2.09$, $\alpha_{MIB} = 2.24$. Note that, since $v$
provides an estimate of the (invested) capital, the power-law behaviour can be
directly related to the Pareto tails[1,2,3,4] describing how wealth is distributed
within the richest part of the economy. Consistently, the small
$v$ range of $p_>(v)$ also seems to mimic the typical form displayed by the left part of
many empirical wealth distributions[3,4], whose functional characterization is
however controversial (log-normal, exponential and Gamma distributions have
all been proposed[3,4] to reproduce it). Since in the following we are
interested in the large $v$ and $k_{in}$ limit, the characterization of the left part of
the distributions is irrelevant here, and we shall only consider the Pareto
tails and the corresponding exponents. Note that, although the scale-free character
encapsulated in eq. (1) is already known to be a widespread topological
feature[12], power-law distributions describing the sum of link weights have
only been addressed theoretically[19] in the field of complex networks. Therefore
our mapping of Pareto distributions (well established in the economic
context[1,2,3,4]) onto a topological framework provides an empirical basis for
the investigation of these specific properties of weighted networks.
Fig. 3. a) Cumulative histograms of $v$ (money units are millions of current US
dollars, or M$) for the extended nets and power-law fits to the tails. b) Scaling of
$v$ against $k_{in}$. The straight lines are the curves $v(k_{in}) \propto k_{in}^{1/\beta}$ with $\beta$ predicted by
eq. (7), and are not fits to the data.

4 Scaling of portfolio volume versus portfolio diversification 



We now look for an additional characterization of our system. In particular,
we ask whether any relation between $(k_{in})_i$ and its weighted counterpart $v_i$ can be
established. If this is the case, then eqs. (1) and (2) are not independent, since
they can be derived from each other through the expression relating $v$ and
$k_{in}$. In a topological context, this directly leads us to the framework explored
in ref.[16], where the degree of a vertex depends on an associated quantity or
fitness, which in this case is embodied in $v_i$. In such a framework the connection
probability is necessarily fitness-dependent and its form, together with that
of the fitness distribution, determines the topology of the network[16]. Our
empirical analysis reveals that this is indeed the case. As shown in Fig. 3b,
we find that $v$ is an increasing function of the corresponding $k_{in}$, following
an approximately straight line in double-logarithmic axes. The slope of this
power-law curve is different across the three markets. However, in the Italian
case two points deviate from this trend, signalling an anomalous behaviour
for small ($k_{in} < 3$) values of the diversification. We checked that these points
correspond to investors holding a very large fraction (> 50%) of the shares of
an asset, whose portfolio therefore has a large volume even if its diversification
is small. Clearly, these investors are the 'effective controllers' of a company.
While in both US markets the fraction of links in the network corresponding 
to such a large weight is of the order of 10^ (so that their effect is irrelevant 
on the plot of Fig. 3b), in MIB it equals the much larger value 0.13.
This determines the 'peak' at small $k_{in}$ superimposed on the power-law trend
in the Italian market, and singles out another important difference between 
MIB and the US markets. 
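The fraction of links carrying such a controlling weight can be computed with a few lines of code. This is only a sketch: the record layout (investor, asset, fraction), the function names and the toy holdings are hypothetical, and the 50% threshold is the one quoted in the text.

```python
def controlling_links(holdings, threshold=0.5):
    """Return the links whose share fraction exceeds the threshold."""
    return [(inv, asset, s) for inv, asset, s in holdings if s > threshold]

def controlling_fraction(holdings, threshold=0.5):
    """Fraction of all links that exceed the threshold (0.0 for no links)."""
    holdings = list(holdings)
    if not holdings:
        return 0.0
    return len(controlling_links(holdings, threshold)) / len(holdings)

# Toy holdings (hypothetical): one of the four links is a controlling stake.
toy_holdings = [
    ("holder_1", "stock_X", 0.60),
    ("holder_2", "stock_X", 0.05),
    ("holder_2", "stock_Y", 0.10),
    ("holder_3", "stock_Y", 0.02),
]
controllers = {inv for inv, _, _ in controlling_links(toy_holdings)}
fraction = controlling_fraction(toy_holdings)
```

Removing the investors in `controllers` before plotting $v$ against $k_{in}$ would be one way to check whether the anomalous points at small diversification disappear.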



5 The fitness model with generalized preferential attachment 

The results discussed so far are rather surprising, since they show that portfolio
structure is governed by simple laws in each of the three markets, allowing
for an integrated description of both ordinary investors and companies
even though their investments are expected to be driven by different factors. The
former are in fact expected (at least within the standard framework of portfolio
selection[15]) to diversify their investments as much as possible in order to
minimize financial risk, while companies instead organize their portfolios in a
more focused way in order to establish strategic business alliances.

Turning to a topological context, we now show that, as anticipated above, the
observed properties can be reproduced by means of a recent stochastic network
model[16] that introduces a fitness variable characterizing each vertex.
Although the original model was designed for undirected graphs, it can be
generalized straightforwardly to directed networks as follows. There are two types of vertices
in the network, which in our case represent the $N$ agents (each characterized
by its fitness $x_i$) and the $M$ assets (characterized by a different quantity $y_j$).
Due to the presence of listed companies acting as both types, the total number
of vertices does not sum up to $N + M$. We shall regard $x_i$ as proportional to
the portfolio volume of $i$, which is the wealth that $i$ decides to invest. The
quantity $y_j$ can instead be viewed as the information (such as the expected
long-term dividends and profit streams) associated with the asset $j$. Note that
$y_j$ can also be a vector of quantities, since the following results can be easily
generalized to the multidimensional case[20]. A link is drawn from $j$ to $i$ with
a probability which is a function $f(x_i, y_j)$ of the associated properties. Note
that $f(x,y) \neq f(y,x)$ in general, unlike in the undirected case[16].
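The link-drawing rule can be sketched as a small simulation. The product form chosen for `link_probability`, the fitness values and the function names are assumptions made purely for illustration; the only ingredient taken from the text is that each asset-to-investor link is drawn independently with probability $f(x_i, y_j)$.

```python
import random

def link_probability(x, y):
    # Hypothetical choice for illustration only: a product of the two
    # fitnesses, with x and y both assumed to lie in [0, 1].
    return x * y

def expected_in_degree(x_i, asset_fitness, f=link_probability):
    """Expected number of assets held: sum of f(x_i, y_j) over all assets."""
    return sum(f(x_i, y_j) for y_j in asset_fitness)

def draw_in_degrees(investor_fitness, asset_fitness, rng, f=link_probability):
    """Draw every asset -> investor link independently; return in-degrees."""
    in_degrees = []
    for x_i in investor_fitness:
        k = sum(1 for y_j in asset_fitness if rng.random() < f(x_i, y_j))
        in_degrees.append(k)
    return in_degrees

rng = random.Random(42)               # fixed seed for reproducibility
investor_fitness = [0.1, 0.5, 0.9]    # toy values of x_i
asset_fitness = [0.5] * 2000          # toy values of y_j
k_drawn = draw_in_degrees(investor_fitness, asset_fitness, rng)
k_expected = [expected_in_degree(x, asset_fitness) for x in investor_fitness]
```

For a large number of assets the drawn in-degrees concentrate around their expected values, which is why the expected in-degree can be treated as a function of the investor's fitness alone.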

The simplest choice is the factorizable form $f(x,y) = g(x)h(y)$, where $g(x)$ is
an increasing function of $x$, which takes into account the fact that investors
with larger capital can afford larger information and transaction costs and are
therefore more likely to diversify their portfolios. The function $h(y)$ encapsulates
the strategy used by the investors to process the information $y$ relative to
each asset. The stochastic nature of the model allows two equally wealthy
agents to make different choices (due for instance to different preferred
investment sectors), even if assets with better expected long-term performance
are statistically more likely to be chosen. For large network sizes, the expected
in-degree of an investor with fitness $x$ is given by

$$k_{in}(x) = g(x) h_T \qquad (3)$$

where $h_T$ is the total value of $h(y)$ summed over all $M$ assets. If the above
relation is invertible, and if $p(x)$ denotes the statistical distribution of $x$
computed over the $N$ agents, then the in-degree distribution is given by

$$P(k_{in}) = p[x(k_{in})] \, \frac{dx(k_{in})}{dk_{in}} \qquad (4)$$

Analogous relations for $k_{out}(y)$ and $P(k_{out})$ can be obtained directly. However,
since our information regarding $k_{out}$ is incomplete (see above), we cannot test
our model with respect to the function $h(y)$, and in the following we shall only
consider the quantities derived from $g(x)$.

Note that the above mechanism differs from those explored in most network
models[12], where new vertices are continuously added and preferentially
linked to pre-existing ones with large degree $k$ ('preferential attachment' rule).
In the latter case, the functional form of the degree-dependent attachment
probability can be measured[12] in real evolving networks, and is found to
be proportional to $k$ ('linear preferential attachment') or more generally to
$k^\beta$ ('nonlinear preferential attachment'). Here, the attachment mechanism is
'preferential' with respect to the variable $x$, and not to the pre-existing vertex
degree. Within this 'generalized preferential attachment' framework, the
analogous choice for the connection probability is then $g(x) = c x^\beta$ with $\beta > 0$,
where $c$ is a normalization constant ensuring $0 \le g(x) \le 1$ (a possible choice
is $c = x_{max}^{-\beta}$, so that by defining $x = v/v_{max}$ we can directly set $c = 1$). It is
straightforward to show that the predicted expressions (3) and (4) now read

$$k_{in}(x) \propto x^\beta \qquad (5)$$

$$P(k_{in}) \propto k_{in}^{(1-\alpha-\beta)/\beta} \qquad (6)$$

where we have used the fact that $p(x) \propto x^{-\alpha}$ for large $x$. Note that the above
results still hold in the more general case when $f(x,y)$ is no longer factorizable,
provided that $k_{in}(x) = M \int f(x,y) \sigma(y) \, dy \propto x^\beta$ as in eq. (5), where $\sigma(y)$ is the
distribution of $y$ computed on the $M$ assets.



6 Discussion and concluding remarks

The empirical power-law forms of $p(x)$, $k_{in}(v)$ and $P(k_{in})$ are therefore in
qualitative agreement with the model predictions. Moreover, by comparing
eqs. (1) and (6) we find that the model predicts the following relation between
the three exponents $\alpha$, $\beta$ and $\gamma$:

$$\beta = \frac{1 - \alpha}{1 - \gamma} \qquad (7)$$

By substituting in the above expression the empirical values of $\alpha$ and $\gamma$
obtained through the fits of Figs. 2 and 3a, we obtain the values of $\beta$ corresponding
to the curves $v(k_{in}) \propto k_{in}^{1/\beta}$ shown in Fig. 3b, which simply represent the
inverse of eq. (5) in terms of the quantity $v$. Remarkably, the curves are all in
excellent agreement with the empirical points shown in the same figure, except
for the 'anomalous' points of MIB. This suggests that the proposed mechanism
describes the investors' behaviour well, apart from that of the effective holders of a
company.
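The scaling relation of eq. (7) can be checked numerically on synthetic data. The sketch below is an idealization and not the analysis performed on the market data: fitness values are drawn from a pure Pareto density with exponent $\alpha$, the in-degree is treated as the continuous deterministic quantity $x^\beta$, and the tail exponent is estimated with a standard maximum-likelihood (Hill-type) estimator. The values of `alpha` and `beta`, the sample size and all function names are illustrative assumptions.

```python
import math
import random

def pareto_sample(alpha, n, rng):
    """Draw n values x >= 1 with density proportional to x**(-alpha)."""
    # Inverse-transform sampling: the survival function is x**(1 - alpha).
    # 1.0 - rng.random() lies in (0, 1], which avoids a zero base.
    return [(1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]

def density_exponent_mle(values, x_min=1.0):
    """Maximum-likelihood estimate of the density exponent above x_min."""
    tail = [v for v in values if v >= x_min]
    return 1.0 + len(tail) / sum(math.log(v / x_min) for v in tail)

rng = random.Random(0)        # fixed seed for reproducibility
alpha, beta = 2.0, 0.8        # illustrative values, not the empirical ones

x = pareto_sample(alpha, 50000, rng)
k = [xi ** beta for xi in x]  # idealized continuous in-degree, eq. (5)

gamma_hat = density_exponent_mle(k)
gamma_predicted = 1.0 + (alpha - 1.0) / beta   # eq. (7) solved for gamma
beta_from_eq7 = (1.0 - alpha) / (1.0 - gamma_hat)
```

With these settings `gamma_hat` should lie close to `gamma_predicted`, and inserting it into eq. (7) should return a value close to the `beta` used to generate the data, up to sampling fluctuations.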

A final comparison with the 'traditional' preferential attachment mechanism is
again revealing. Note that here we always observe the analogue of a superlinear
($\beta > 1$) preferential attachment. However, while the traditional mechanism
yields scale-free topologies only in the linear case[12], here we observe power-law
degree distributions in the nonlinear case as well. This is a remarkable
result, since in order to obtain the empirical forms of $P(k_{in})$ the exponent
$\beta$ does not need to be fine-tuned, and the results are therefore more robust
under modification of the model hypotheses. Also note that, interestingly,
Pareto's law of wealth distribution has also been proposed[2,21] as a possible
explanation for the 'fat tails' observed in financial market fluctuations. In a
network context, the above results support the hypothesis that the presence
of non-topological quantities associated with the vertices may be at the basis
of the emergence of complex scale-free topologies in a large number of real
networks[16].

The authors acknowledge EU FET Open Project IST-2001-33555 COSIN for support
and E. Sciubba, F. Lillo and M. Buchanan for helpful discussions and
comments.